Goto

Collaborating Authors

 data exposure


Thousands of Vibe-Coded Apps Expose Corporate and Personal Data on the Open Web

WIRED

Companies like Lovable, Base44, Replit, and Netlify use AI to let anyone build a web app in seconds--and in thousands of cases, spill highly sensitive data onto the public internet. As AI increasingly takes over the work of modern programmers, the cybersecurity world has warned that automated coding tools are sure to introduce a new bounty of hackable bugs into software. When those same vibe-coding tools invite anyone to create applications hosted on the web with a click, however, it turns out the security implications go beyond bugs to a total absence of any security--even, sometimes, for highly sensitive corporate and personal data. Security researcher Dor Zvi and his team at the cybersecurity firm he cofounded, RedAccess, analyzed thousands of vibe-coded web applications created using the AI software development tools Lovable, Replit, Base44, and Netlify and found more than 5,000 of them that had virtually no security or authentication of any kind. Many of these web apps allowed anyone who merely finds their web URL to access the apps and their data.


An AI Toy Exposed 50,000 Logs of Its Chats With Kids to Anyone With a Gmail Account

WIRED

AI chat toy company Bondu left its web console almost entirely unprotected. Researchers who accessed it found nearly all the conversations children had with the company's stuffed animals. Earlier this month, Joseph Thacker's neighbor mentioned to him that she'd preordered a couple of stuffed dinosaur toys for her children. She'd chosen the toys, called Bondus, because they offered an AI chat feature that lets children talk to the toy like a kind of machine-learning-enabled imaginary friend. But she knew Thacker, a security researcher, had done work on AI risks for kids, and she was curious about his thoughts.


Protect Your Secrets: Understanding and Measuring Data Exposure in VSCode Extensions

arXiv.org Artificial Intelligence

Recent years have witnessed the emerging trend of extensions in modern Integrated Development Environments (IDEs) like Visual Studio Code (VSCode) that significantly enhance developer productivity. Especially, popular AI coding assistants like GitHub Copilot and Tabnine provide conveniences like automated code completion and debugging. While these extensions offer numerous benefits, they may introduce privacy and security concerns to software developers. However, there is no existing work that systematically analyzes the security and privacy concerns, including the risks of data exposure in VSCode extensions. In this paper, we investigate on the security issues of cross-extension interactions in VSCode and shed light on the vulnerabilities caused by data exposure among different extensions. Our study uncovers high-impact security flaws that could allow adversaries to stealthily acquire or manipulate credential-related data (e.g., passwords, API keys, access tokens) from other extensions if not properly handled by extension vendors. To measure their prevalence, we design a novel automated risk detection framework that leverages program analysis and natural language processing techniques to automatically identify potential risks in VSCode extensions. By applying our tool to 27,261 real-world VSCode extensions, we discover that 8.5% of them (i.e., 2,325 extensions) are exposed to credential-related data leakage through various vectors, such as commands, user input, and configurations. Our study sheds light on the security challenges and flaws of the extension-in-IDE paradigm and provides suggestions and recommendations for improving the security of VSCode extensions and mitigating the risks of data exposure.


Data Exposure from LLM Apps: An In-depth Investigation of OpenAI's GPTs

arXiv.org Artificial Intelligence

LLM app ecosystems are quickly maturing and supporting a wide range of use cases, which requires them to collect excessive user data. Given that the LLM apps are developed by third-parties and that anecdotal evidence suggests LLM platforms currently do not strictly enforce their policies, user data shared with arbitrary third-parties poses a significant privacy risk. In this paper we aim to bring transparency in data practices of LLM apps. As a case study, we study OpenAI's GPT app ecosystem. We develop an LLM-based framework to conduct the static analysis of natural language-based source code of GPTs and their Actions (external services) to characterize their data collection practices. Our findings indicate that Actions collect expansive data about users, including sensitive information prohibited by OpenAI, such as passwords. We find that some Actions, including related to advertising and analytics, are embedded in multiple GPTs, which allow them to track user activities across GPTs. Additionally, co-occurrence of Actions exposes as much as 9.5x more data to them, than it is exposed to individual Actions. Lastly, we develop an LLM-based privacy policy analysis framework to automatically check the consistency of data collection by Actions with disclosures in their privacy policies. Our measurements indicate that the disclosures for most of the collected data types are omitted in privacy policies, with only 5.8% of Actions clearly disclosing their data collection practices.


Hundreds of millions of Facebook records exposed on public servers โ€“ report

The Guardian

More than 540m Facebook records were left exposed on public internet servers, cybersecurity researchers said on Wednesday, in just the latest security black eye for the company. Researchers for the firm UpGuard discovered two separate sets of Facebook user data on public Amazon cloud servers, the company detailed in a blogpost. One dataset, linked to the Mexican media company Cultura Colectiva, contained more than 540m records, including comments, likes, reactions, account names, Facebook IDs and more. The other set, linked to a defunct Facebook app called At the Pool, was significantly smaller, but contained plaintext passwords for 22,000 users. The large dataset was secured on Wednesday after Bloomberg, which first reported the leak, contacted Facebook.